A New Perspective on an Old Tool: Extending the Coverage of Sequence Similarity-Based Function Prediction with PFP
Abstract
The ultimate aim of molecular biology is to define the functional roles of all proteins. Over the last decade, several methods have been developed that produce large amounts of data, including correlated expression analysis by microarrays, fast genome sequencing, and large-scale proteomics screens. The experimental community has called upon computational biologists to help organize, analyze, and interpret these data so that their utility can be maximized. The computational gene function prediction methods proposed thus far fall into four categories: evolutionary methods, which use conserved global sequence or structure to infer homology, and motifs to assign biochemical function and binding sites; genomic methods, which link proteins through domain fusion events, phylogenetic profiling, conserved gene order, and common regulatory elements; cellular methods, which use large proteomics datasets to define protein-protein interaction patterns and complexes; and metabolic methods, which use the structured networks of biochemical pathways to match proteins to uncharacterized reactions. The most successful techniques combine clues from multiple contexts to make reliable predictions, but limited coverage remains a major obstacle to function prediction: even BLAST [1] searches can cover only half of the genes in a genome. To provide functional clues that can spark analysis of large proteomics datasets, we need a method that expands coverage by lowering prediction resolution, i.e., a method that provides accurate (but more general) predictions for proteins that fall outside the coverage range of current techniques.
Publication date: 2005